Abstract


This paper presents deep learning models for the classification of Diabetic Retinopathy (DR) grades. The
goal of this research is to build a deep learning model that classifies a fundus image with high accuracy
into one of the five phases of DR: no DR, mild, moderate, severe, and proliferative DR. The work is
developed in four steps. First, the fundus images are pre-processed using Ben Graham's pre-processing
method. Second, the pre-processed images are fed to the deep learning algorithms to train the models.
Third, deep learning models, namely a Deep CNN, DenseNet, and Visual Geometry Group 19 (VGG19), are
developed to predict the severity of DR. The APTOS Blindness Detection dataset is used to train the
proposed deep learning models. Fourth, because the dataset is imbalanced and therefore introduces
training bias, the class weight technique is applied during training to reduce that bias. The proposed
deep learning models work well for DR grading, and DenseNet is found to perform better than the other
two models.

Keywords: Diabetic retinopathy, fundus image, deep learning, class weight


1. Introduction 
In 2019, the global prevalence of diabetes was estimated to be 9.3% (463 million individuals), and it is
expected to increase to 10.2% (578 million) by 2030 and 10.9% (700 million) by 2045 [1]. DR is an
eye-related condition that occurs in persons with prolonged (i.e. more than 20 years) diabetes, and it is
becoming one of the major vision-related problems today. A large part of the population falls into one of
the stages of DR: no DR, mild, moderate, severe and proliferative DR.

It is therefore important to adopt frequent pre-screening for DR with a Computer Aided Diagnosis (CAD)
method [2]. It has long been known that there is a need for a robust and automated DR screening process,
and previous attempts have made good progress using image detection, pattern recognition, and machine
learning with eye photos as input. The purpose of this work is to build a new paradigm that will
hopefully lead to practical clinical potential, and the resulting models can be used to carry out the DR
screening process. Manual screening, however, requires qualified and trained experts. A survey conducted
between 2015 and 2019 to determine the number of ophthalmologists around the world found that there were
just 25 thousand ophthalmologists in 194 countries [3]. CAD-based approaches are therefore becoming the
most widely used approach nowadays.

In the automatic grading of DR using CAD at the time of screening, three major problems remain. The first
is that most of the CAD programs currently available support the diagnosis of DR in only two grades, i.e.
normal and abnormal. In fact, however, DR progresses through five phases [4]: no DR, mild, moderate,
severe, and proliferative DR. The second is achieving good overall accuracy in multi-class
classification, and the third is the imbalanced dataset. An imbalanced dataset is one in which the
classes are not uniformly represented, i.e. the number of samples belonging to each class differs. This
means that changes to our network had to be made to ensure that the features of these images could still
be learned. As far as we are aware, there are also very few articles addressing the five-class
classification of DR using a CNN technique.

In this paper we present deep learning models, namely DenseNet, a basic CNN, and VGG19. Figure 1 displays
a pictorial view of the proposed model. To predict the severity of DR, the model takes a fundus image as
input and applies a trained model to it. The proposed deep learning models adopt the following steps:

• Image pre-processing techniques are applied as the first step in designing and training the model.

• Image resizing and image cropping are applied at the second stage of the model creation process.

• The construction of the deep learning models is the third and most important step.

• At the fourth stage, the class weight approach is adopted to deal with the training bias problem caused
by the imbalanced dataset.

• Finally, the performance of the built models is assessed.

The paper is organized as follows. Section 2 reviews the related work done by researchers in this field
and describes the various methods of detection and classification of DR. Section 3 gives detailed
information on the dataset used to train and test the deep learning models. Section 4 describes the
techniques adopted for pre-processing the fundus images and the construction of the deep learning models.
Section 5 presents the results obtained, and Section 6 concludes the paper.
[Figure: block diagram; recoverable stages: Input Fundus Image, Pre-processing, Resize and Crop]

Fig.1. Diabetic Retinopathy prediction system using deep learning model.

2. Related work 
S. Gayatri et al. [5] proposed in 2020 an approach based on Haralick and Anisotropic Dual-Tree Complex
Wavelet Transform (ADTCWT) feature extraction. Multiple classifiers such as Support Vector Machine (SVM),
Random Forest, Random Tree, and J48 were tested, and Random Forest was found to outperform all the other
classifiers with an overall accuracy of 99.7 percent. In 2020, S. Gayatri et al. [6] proposed a
lightweight CNN for the classification of DR from fundus images; the reported results show that the
proposed feature extraction technique performs better when combined with the J48 classifier. An updated
color auto-correlogram function (AutoCC) with low dimensionality was proposed by Raghav Venkatesan et
al. [7] in 2012. Jaakko Sahlsten et al. [8] proposed deep learning fundus image processing for the
classification of DR and Macular Edema in 2019. The deep learning framework introduced in that research
recognizes referable DR.

Automatic detection of mild and multi-class DR using deep learning was proposed by Rubina Sarki et
al. [9] in 2020.

In 2019, Karthikeyan S et al. [10] proposed a model for the detection of multi-class retinal diseases
using artificial intelligence. The proposed model uses minimal data to train the CNN. Yung-Hui Li et
al. [11] proposed in 2020 a CAD framework for DR based on fundus images using a deep CNN. In 2012, Man Li
et al. [12] proposed a common approach to handling the training bias caused by an imbalanced dataset:
samples in rare classes are weighted with a high cost, and cost-sensitive learning strategies are then
applied to fix the class imbalance issue.
State-of-the-art methods for DR detection and classification from color fundus images using deep learning
techniques were reviewed and evaluated in [13] and [14-19]. As seen from the literature, different
machine learning techniques such as Random Forest, SVM, etc. have been applied to enhance the performance
of DR detection. However, there is still scope to explore more relevant features that can contribute to
the identification of a DR image.


3. Dataset 

The "Asia Pacific Tele-Ophthalmology Society (APTOS) Blindness Detection" dataset [15] is used to train
and test the deep learning models. The dataset includes a total of 3662 color fundus images, each graded
for DR by a clinician on a scale from 0 to 4. In classification problems, an imbalanced dataset is one in
which the distribution of samples across classes is not uniform. Fig.2 [16] shows the distribution of the
dataset per DR grade.


[Figure: bar chart titled "Distribution of Output Classes"; x-axis: Target Classes (No DR, Mild,
Moderate, Severe, Proliferate); y-axis: Number of Occurrences]

Fig.2. Distribution of number of samples per DR grade

4. Methods 

This section lists the methods proposed and implemented to build the deep learning models that classify
fundus images into the different grades of DR. This paper proposes three deep learning models: a Base
Model (CNN only), a VGG19 Model, and a DenseNet Model.

Fundus images, pre-processed using the Ben Graham method, are the input used to train these models. The
dataset is a large collection of retina images acquired using fundus photography under a range of imaging
conditions. Like any other real-world dataset, the images contain noise. In addition, the photographs
were obtained over an extended period of time from several clinics using a range of cameras, which
introduces further variation. It is therefore important to apply pre-processing techniques to the fundus
images.


4.1 Fundus image pre-processing 


This subsection explains in detail the different techniques used for image pre-processing.


1) Gaussian Blur/Smooth: The Gaussian filter is used to blur or smooth the input fundus images. It works
like the average filter, except that its kernel weights follow a Gaussian distribution instead of being
uniform.

2) Ben Graham approach: The Ben Graham approach [17] is adopted for pre-processing the input fundus
images. Graham both rescales the input fundus image and applies a circular crop to it. The steps of
Graham's fundus image pre-processing approach are as follows:

• Rescale the images so that they have the same radius, i.e. 300 pixels or 500 pixels.

• Subtract the local average color; the local average is mapped to 50 percent grey.

• Clip the images to 90% of their size to eliminate boundary effects.

3) Image Cropping: In the dataset, images have a black section around the actual image of the eye. The
black portion affects the model's output because it contains no data, so it has to be cut out of the
picture.

4) Resizing: The images in the dataset vary in size. We therefore rescaled each image to a radius of 500
pixels so that all images have the same size. Fig. 3 (b) shows the pre-processed image after applying all
image pre-processing techniques, such as Gaussian blurring and Graham's proposed methods.
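
The cropping and Graham steps listed above can be sketched in plain NumPy as follows. This is a minimal illustration, not the implementation used in this work: the function names, the darkness threshold `tol`, the blur width (radius / 30) and the weights 4, -4 and 128 are assumptions taken from commonly circulated versions of Graham's script, and the rescaling to a fixed radius is left out because it normally relies on an image library.

```python
import numpy as np


def crop_black_border(img, tol=7):
    """Drop outer rows/columns in which every pixel is darker than `tol` (assumed threshold)."""
    mask = img.mean(axis=2) > tol
    if not mask.any():  # fully dark image: nothing to crop against
        return img
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding; img is a float array of shape (H, W, C)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()  # weights sum to 1, so flat regions are unchanged
    out = img
    for axis in (0, 1):  # blur along rows, then along columns
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded)
    return out


def graham_preprocess(img, sigma_fraction=1 / 30, keep=0.9):
    """Subtract the local average colour, map it to 50% grey, keep a 90% circle."""
    img = img.astype(np.float64)
    h, w = img.shape[:2]
    radius = min(h, w) / 2.0
    blurred = gaussian_blur(img, sigma=max(radius * sigma_fraction, 0.5))
    # img - blurred removes the local average; +128 re-centres it on mid grey
    out = 4.0 * img - 4.0 * blurred + 128.0
    # Circular mask at `keep` times the radius removes boundary effects
    yy, xx = np.ogrid[:h, :w]
    dist2 = (yy - (h - 1) / 2.0) ** 2 + (xx - (w - 1) / 2.0) ** 2
    inside = dist2 <= (keep * radius) ** 2
    out = np.where(inside[..., None], out, 128.0)
    return np.rint(np.clip(out, 0, 255)).astype(np.uint8)
```

A fundus image loaded as an (H, W, 3) `uint8` array would be passed through `crop_black_border` first and then through `graham_preprocess`; pixels outside the retained circle come out as mid grey.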

4.2 Building deep learning models 
This section presents the deep learning models: a simple basic CNN, VGG19, and DenseNet. The pre-trained
models are modified, and the class weight method is applied during training to overcome the issue of
training bias.
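
The class weight idea can be illustrated with a short sketch. The paper does not state which weighting formula was used, so the "balanced" heuristic below, w_c = N / (K * n_c) for N samples, K classes and n_c samples in class c, is an assumption; the function name and the toy label list are hypothetical and are not the APTOS counts.

```python
from collections import Counter


def balanced_class_weights(labels, num_classes):
    """Return {class: weight} with weight = N / (K * n_c), so rare classes weigh more."""
    counts = Counter(labels)
    total = len(labels)
    weights = {}
    for c in range(num_classes):
        if counts[c] == 0:
            raise ValueError(f"class {c} has no samples")
        weights[c] = total / (num_classes * counts[c])
    return weights


# Hypothetical toy labels for the five DR grades (0 = no DR ... 4 = proliferative DR)
toy_labels = [0] * 10 + [1] * 2 + [2] * 5 + [3] * 1 + [4] * 2
toy_weights = balanced_class_weights(toy_labels, num_classes=5)
```

With these weights every class contributes the same total weight to the loss. In Keras, a dictionary of this form can be passed through the `class_weight` argument of `Model.fit`; whether the models in this paper were trained exactly this way is not stated.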

A) Simple Basic Deep CNN:- In deep learning, a CNN is a class of deep neural networks most commonly
applied to analysing visual imagery. The constructed model is trained for 15 epochs with a batch size of
128.





(a) (b) 
Fig.3. Preprocessing (a) Input Fundus image (b) 
Preprocessed fundus image. 

B) VGG19:- VGG19 is a CNN that is 19 layers deep. A pre-trained version of the network, trained on more
than a million images from the ImageNet database, can be loaded. It was one of the top-performing entries
in the ImageNet Challenge in 2014 [18]. This network is distinguished by its simplicity: it uses only 3×3
convolution layers stacked on top of each other with increasing depth.


C) DenseNet:- A DenseNet is a type of CNN that uses dense connections between layers, via Dense Blocks,
in which all layers are directly connected with each other. To maintain the feed-forward nature, each
layer obtains additional inputs from all preceding layers and passes on its own feature-maps to all
subsequent layers [19]. DenseNet is more efficient on some image classification benchmarks.
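
As a side note on item B, the effect of stacking 3×3 convolutions can be checked with a little arithmetic. The sketch below assumes stride-1 convolutions with the same number of input and output channels and ignores biases; the helper names are hypothetical.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field (in pixels) of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels):
    """Weight count of one convolution with `channels` input and output channels."""
    return kernel * kernel * channels * channels


channels = 64  # illustrative channel count
two_3x3 = 2 * conv_weights(3, channels)  # two stacked 3x3 layers
one_5x5 = conv_weights(5, channels)      # one 5x5 layer, same receptive field
```

Under these assumptions, two stacked 3×3 layers cover the same 5×5 window as a single 5×5 layer with fewer weights (18·C² against 25·C²), and they add an extra non-linearity in between.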


[Figure: layer diagram. densenet121_input: InputLayer, output (?, 256, 256, 3); densenet121: Functional,
output (?, 8, 8, 1024); global_average_pooling2d: GlobalAveragePooling2D, output (?, 1024); one layer
with an illegible name, output (?, 1024); dense: Dense]

Fig.4. Pictorial view of DenseNet BC deep learning model
Figure 4 shows a pictorial view of the DenseNet BC deep learning model, presented for illustration
purposes.
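
The feature-map shape reported for the densenet121 block in Fig. 4 can be reproduced with a short calculation. The sketch assumes the standard DenseNet-121 configuration (growth rate 32, dense blocks of 6, 12, 24 and 16 layers, 64 initial channels, compression 0.5 in the transition layers); the paper does not list these settings, and the function name is hypothetical.

```python
def densenet121_output_shape(input_size=256, growth_rate=32,
                             block_layers=(6, 12, 24, 16),
                             init_channels=64, compression=0.5):
    """Return (height, width, channels) of the last dense block's output."""
    size = input_size // 2  # initial 7x7 convolution with stride 2
    size = size // 2        # 3x3 max pooling with stride 2
    channels = init_channels
    for i, num_layers in enumerate(block_layers):
        # Dense connectivity: every layer appends `growth_rate` feature-maps
        channels += num_layers * growth_rate
        if i < len(block_layers) - 1:
            # Transition layer: compress the channels and halve the resolution
            channels = int(channels * compression)
            size //= 2
    return size, size, channels


shape_256 = densenet121_output_shape(256)
```

For a 256×256 input this gives 8×8×1024, matching the shape shown in Fig. 4; global average pooling then reduces it to a 1024-dimensional vector for the final dense layer.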

5. Results 

This section presents the results obtained while training the models and after applying the trained deep
learning models to the test dataset.

A) Basic CNN Model:- The per-epoch training and validation accuracy and loss of the basic deep CNN model
are presented in Figure 5 and Figure 6, respectively.

B) VGG19 Model:- The per-epoch training and validation accuracy and loss of the VGG19 model are shown in
Figure 7 and Figure 8, respectively.

C) DenseNet Model:- The per-epoch training accuracy and loss of the DenseNet model are shown in Figure 9
and Figure 10, respectively.

D) Comparison of Basic deep CNN, VGG19 and DenseNet:- Table 1 shows that DenseNet achieves a higher
training accuracy than the basic deep CNN and VGG19. DenseNet is therefore suggested for developing a DR
severity prediction system.


[Figure: plot titled "model accuracy"]

Fig. 5. Basic deep CNN model per epoch accuracy.

[Figure: plot titled "model loss"; x-axis: epoch]

Fig. 6. Basic CNN Model per epoch loss.
Table.1. Performance comparison of Simple CNN, VGG19 and DenseNet modified architecture with training and
validation accuracy

Models/Acc.     | Simple CNN | VGG19  | DenseNet
Training Acc.   | 0.7453     | 0.8354 | 0.9883
Validation Acc. | 0.7366     | 0.8199 | 0.8172

[Figure: plot titled "Training and validation accuracy"; curves: Training acc, Validation acc]

Fig. 7. VGG19 Model per epoch accuracy.

[Figure: plot titled "Training and validation loss"; curves: Training loss, Validation loss]

Fig. 8. VGG19 Model per epoch loss.

[Figure: plot titled "Training and validation accuracy"; curves: Training acc, Validation acc]

Fig. 9. DenseNet Model per epoch accuracy.

[Figure: plot titled "Training and validation loss"; curves: Training loss, Validation loss]

Fig. 10. DenseNet Model per epoch loss.

6. Conclusions

Frequent mass DR screening is becoming essential in both developed and developing countries. In this work
we constructed deep learning models, namely a Deep CNN, DenseNet and VGG19, and trained them using the
publicly available APTOS Blindness Detection data. We also applied the class weight method to overcome
the training bias problem. The performance of the developed models was evaluated using accuracy as the
performance metric. The training accuracies of the basic CNN, VGG19, and DenseNet are 74.53%, 83.54% and
98.83%, respectively. Finally, the DenseNet deep learning model is suggested for prescreening of DR. The
future scope of this work is to make use of transfer learning to fine-tune the parameters of the
pre-trained network (DenseNet) for the image classification task.

References